Analysis of $p$-Laplacian Regularization in Semi-Supervised Learning
Abstract
We investigate a family of regression problems in a semi-supervised setting. The task is to assign real-valued labels to a set of n sample points, provided a small training subset of N labeled points. A goal of semi-supervised learning is to take advantage of the (geometric) structure provided by the large number of unlabeled data when assigning labels. We consider a random geometric graph, with connection radius ε(n), to represent the geometry of the data set. We study objective functions which reward the regularity of the estimator function and impose or reward the agreement with the training data. In particular we consider discrete p-Laplacian regularization. We investigate asymptotic behavior in the limit where the number of unlabeled points increases while the number of training points remains fixed. We uncover a delicate interplay between the regularizing nature of the functionals considered and the nonlocality inherent to the graph constructions. We rigorously obtain almost optimal ranges on the scaling of ε(n) for the asymptotic consistency to hold. We discover that for standard approaches used thus far there is a restrictive upper bound on how quickly ε(n) must converge to zero as n→∞. Furthermore we introduce a new model which overcomes this restriction. It is as simple as the standard models, but converges as soon as ε(n)→ 0 as n→∞.
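As a rough illustration of the setup described in the abstract, the sketch below builds a random geometric graph with connection radius ε, evaluates a discrete p-Laplacian (p-Dirichlet) energy, and computes the constrained minimizer in the special case p = 2. The 0/1 edge weights, the 1/(ε^p n²) normalization, and all function names are assumptions made for this example and are not taken from the paper; the p = 2 solver is simply the standard harmonic extension on a graph, not the new model the abstract mentions.

```python
import numpy as np


def geometric_graph_weights(X, eps):
    """Assumed 0/1 kernel: W[i, j] = 1 when 0 < |x_i - x_j| <= eps."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    W = (dist <= eps).astype(float)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def p_dirichlet_energy(W, f, eps, p):
    """Assumed normalization: (1 / (eps^p n^2)) * sum_ij W_ij |f_i - f_j|^p."""
    n = len(f)
    gaps = np.abs(f[:, None] - f[None, :]) ** p
    return float((W * gaps).sum() / (eps ** p * n ** 2))


def constrained_minimizer_p2(W, labeled_idx, labels):
    """Minimize the p = 2 energy subject to f[labeled_idx] = labels.

    Requires every connected component of the graph to contain at least
    one labeled point; otherwise the linear system is singular.
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    labeled_idx = np.asarray(labeled_idx)
    unlabeled = np.setdiff1d(np.arange(n), labeled_idx)
    f = np.zeros(n)
    f[labeled_idx] = labels
    # Stationarity on the unlabeled nodes: L_uu f_u = -L_ul f_l
    rhs = -L[np.ix_(unlabeled, labeled_idx)] @ f[labeled_idx]
    f[unlabeled] = np.linalg.solve(L[np.ix_(unlabeled, unlabeled)], rhs)
    return f


if __name__ == "__main__":
    # Hypothetical toy data: 11 evenly spaced points, the two endpoints labeled.
    X = np.linspace(0.0, 1.0, 11)[:, None]
    eps = 0.15
    W = geometric_graph_weights(X, eps)
    f = constrained_minimizer_p2(W, [0, 10], [0.0, 1.0])
    print(f)
    print(p_dirichlet_energy(W, f, eps, p=2))
```

In this toy example each interior point is connected only to its two neighbors, so the p = 2 minimizer is the linear interpolant between the two labels. Rescaling the energy by a constant does not change the minimizer, so the assumed normalization matters only when comparing energies across different n and ε.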
Related papers
Robust Image Analysis by L1-Norm Semi-supervised Learning
This paper presents a novel L1-norm semi-supervised learning algorithm for robust image analysis by giving a new L1-norm formulation of Laplacian regularization, which is the key step of graph-based semi-supervised learning. Since our L1-norm Laplacian regularization is defined directly over the eigenvectors of the normalized Laplacian matrix, we successfully formulate semi-supervised learning as a...
Regularized Semi-supervised Classification on Manifold
Semi-supervised learning estimates the marginal distribution $P_X$ from a large number of unlabeled examples and then constrains the conditional probability $p(y \mid x)$ with a few labeled examples. In this paper, we focus on a regularization approach for semi-supervised classification. The label information graph is first defined to keep the pairwise label relationship and can be incorporated wi...
Linear Manifold Regularization for Large Scale Semi-supervised Learning
The enormous wealth of unlabeled data in many applications of machine learning is beginning to pose challenges to the designers of semi-supervised learning methods. We are interested in developing linear classification algorithms to efficiently learn from massive partially labeled datasets. In this paper, we propose Linear Laplacian Support Vector Machines and Linear Laplacian Regularized Least...
Semi-supervised Learning by Higher Order Regularization
In semi-supervised learning, in the limit of infinitely many unlabeled points with the labeled ones fixed, the solutions of several graph Laplacian regularization based algorithms were shown by Nadler et al. (2009) to degenerate to constant functions with “spikes” at labeled points in $\mathbb{R}^d$ for d ≥ 2. These optimization problems all use the graph Laplacian regularizer as a common penalty term. In this paper...
On the Effectiveness of Laplacian Normalization for Graph Semi-supervised Learning
This paper investigates the effect of Laplacian normalization in graph-based semi-supervised learning. To this end, we consider multi-class transductive learning on graphs with Laplacian regularization. Generalization bounds are derived using geometric properties of the graph. Specifically, by introducing a definition of graph cut from learning theory, we obtain generalization bounds that depen...
Lecture 6: Manifold Regularization
We first analyze the limits of learning in high dimension. Hence, we stress the difference between high dimensional ambient space and intrinsic geometry associated to the marginal distribution. We observe that, in the semi-supervised setting, unlabeled data could be used to exploit low dimensionality of the intrinsic geometry. In order to formalize these intuitions we briefly introduce the mani...
Journal:
- CoRR
Volume: abs/1707.06213
Publication date: 2017